
METHOD FOR CONSTRUCTING ERASURE 
CORRECTING CODES WHOSE IMPLEMENTATION 
REQUIRES ONLY EXCLUSIVE ORS 

DESCRIPTION 

BACKGROUND OF THE INVENTION 

Field of the Invention 

The present invention generally relates to codes for correcting erasure
errors and, more particularly, to constructing codes for which the encoding
and correcting algorithms can be executed fast or "on-the-fly". The invention
has particular application in reading data from arrays of hard disks in data
processing systems.

Background Description 

In some applications, for example in RAID (Redundant Array of
Inexpensive (or Independent) Disks) systems, it is necessary to construct codes
for correcting erasure errors; i.e., errors whose location is known. It is also
desirable that both the encoding and the correcting algorithms be executed
fast. The usual encoding and correcting algorithms operate on bytes (or as
many bits as the dimension of the field), and thus require the breaking of the
stream of data into small chunks plus special circuitry to perform the field
operations. This is time consuming when done on a general purpose
microprocessor, and therefore specially designed chips, such as ASICs
(Application Specific Integrated Circuits), are used for the execution of these
algorithms.

Storage systems have relied on simple erasure codes (e.g., parity,
mirroring, etc.) to protect against data loss. However, disk drive reliability has
not increased as fast as the drive capacity has increased, creating significant
vulnerabilities for simple codes. The industry-wide shift to SATA (Serial
Advanced Technology Attachment) based hard disk drives (HDDs) will make
this more of an issue. Therefore, significantly stronger erasure codes are going
to be required for storage systems.

It is known that Reed-Solomon (RS) coding is used in RAID systems.
See, for example, the article by James S. Plank entitled "A Tutorial on
Reed-Solomon Coding for Fault-Tolerance in RAID-like Systems", Software -
Practice and Experience, Vol. 27(9), September 1997, pp. 995-1012. The
author describes a three data, three check arrangement as an example of how
to produce RS encoding and decoding. He shows how to generate the
appropriate Vandermonde matrix, how to use it to generate three checks, and
then how to use it to decode from three devices. However, the author states on
page 1008: "To the author's knowledge, there is no parity-based scheme that
tolerates three or more device failures with minimal device overhead."

It is highly desirable to have the encoding process be as efficient as
possible, both in terms of operations per data byte transferred and required
memory operations. Ideal solutions would rely only on exclusive OR (XOR)
operations, with no branching. The EVENODD code is one such code, but it
has some significant limitations. First, there are known solutions only up to
distance four. Second, the size of a data set on a drive must be a multiple of a
prime number minus one, and the number of data drives must also be less than
the prime number. The result is a rather inflexible code, which usually requires
operating on a rather large data set, since 3, 5 and 17 are the only small primes
where the prime equals $2^n+1$, allowing standard length data sectors. The data
operations are performed by mixing data words from (prime minus one)
locations on each data drive, necessitating a large register set or many memory
operations.

U.S. Patent No. 5,271,012 to Blaum et al. discloses a method for
encoding and rebuilding the data contents of up to two unavailable direct
access storage devices (DASDs) in a DASD array. This method uses an
example of the EVENODD code described above. Also relevant are U.S.
Patents No. 5,333,143 and No. 5,579,475, both to Blaum et al., which disclose
similar methods for coding and rebuilding data from up to two unavailable
DASDs in a DASD array.

SUMMARY OF THE INVENTION 

It is therefore an object of the present invention to provide a general
solution to providing erasure correcting codes for distances greater than two
using only XOR operations.

According to the invention, any code over a finite field of
characteristic two can be converted into a code whose encoding and correcting
algorithms involve only XORs of words (and loading and storing of the data).
Thus, the implementation of the encoding and correcting algorithms is more
efficient, since it uses only XORs of words - an operation which is available
on almost all microprocessors.

The preferred embodiment of the invention is a computer implemented
method for correcting four or more erasure errors whose locations are known.
The method first converts a code over a finite field of characteristic two into a
code whose encoding and correcting algorithms involve only exclusive OR
(XOR) operations of words. Data is read from main volatile memory and
encoded using only XOR operations to generate a correcting code. The data
and correcting code are then stored in an auxiliary array of non-volatile
storage devices. Data and correcting code are read from the auxiliary array of
non-volatile storage devices. Erasure errors in the data read from the auxiliary
array of non-volatile storage devices are detected and, using only XOR
operations, reconstructed data is generated.

According to one aspect of the invention, there is provided an
encoding and correcting method which can be performed using only XOR
operations on words for error correcting codes with four or more check
symbols which can correct as many errors as there are check symbols.
According to another aspect of the invention, there is provided an encoding
and correcting method which can be obtained by transforming encoding and
decoding matrices over $GF(2^n)$, the Galois Field of $2^n$ elements for $n$
greater than one. This method can be performed using only XOR operations on
words. According to a third aspect of the invention, there is provided a code
whose encoding and decoding involve only XOR operations of words and that is
specific to the (3, 3) code of distance 4.

BRIEF DESCRIPTION OF THE DRAWINGS 

The foregoing and other objects, aspects and advantages will be better
understood from the following detailed description of a preferred embodiment
of the invention with reference to the drawings, in which:

Figure 1 is a block diagram illustrating an encoding system for a
RAID system;

Figure 2 is a block diagram illustrating a data reconstructing system for
reading data from the RAID system;

Figure 3 is a flow diagram illustrating the logic implemented in the
encoder of the system shown in Figure 1; and

Figure 4 is a flow diagram illustrating the logic implemented in the 
decoder of the system shown in Figure 2. 

DETAILED DESCRIPTION OF A PREFERRED 
EMBODIMENT OF THE INVENTION 

Referring now to the drawings, and more particularly to Figure 1, there
is shown a block diagram illustrating an encoding system for a RAID system
in which the present invention may be implemented. Data stored in main
memory 10, such as volatile random access memory (RAM), of a computer
system is read to an auxiliary storage RAID system 12 for non-volatile
storage. At the same time, the data is read to an encoder 14 for encoding the
erasure correcting codes. The encoder can be implemented by suitable code in
a microprocessor of the computer system. These erasure correcting codes are
also stored in the RAID system 12. In the example illustrated, the RAID array
comprises three data disks, 12₁ to 12₃, across which the data is striped, and
three disks, 12₄ to 12₆, which store the checking part of the erasure correcting
code.

Figure 2 is a block diagram illustrating an example of reconstructing
data from the RAID system 12 for the case of two failed disks, disks 12₂ and
12₅, for example. The data from the four non-failed disks, disks 12₁, 12₃, 12₄
and 12₆, are read out to a correcting circuit 20 for reconstructing the data. This
circuit can be implemented by suitable code in the microprocessor of the
computer system. The reconstructed data output from the circuit 20 is stored
in memory 10.

We will start by describing a particular code which is constructed by
the method according to the invention - a code which is of interest in its own
right - and then describe the general methodology. This code is based on a
code of six symbols, $x_0$, $x_1$, $x_2$, $x_3$, $x_4$ and $x_5$, each of
which is an element of GF(4), the Galois Field of four elements (see James S.
Plank, "A Tutorial on Reed-Solomon Coding for Fault-Tolerance in RAID-like
Systems", supra), and where $x_0$, $x_1$ and $x_2$ are the information symbols
and $x_3$, $x_4$ and $x_5$ are the check symbols. The check symbols are
defined by:

$$
X_{3,4,5} = A\,X_{0,1,2}, \quad\text{that is,}\quad
\begin{bmatrix} x_3 \\ x_4 \\ x_5 \end{bmatrix} =
\begin{bmatrix} 1 & 1 & 1 \\ 1 & a & a^2 \\ 1 & a^2 & a \end{bmatrix}
\begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix},
$$

where $a$ is an element of GF(4) which satisfies the equation $1+a+a^2=0$.

This code has distance four; that is, it is possible to correct any three
(or fewer) erasure errors. For example, if $x_0$, $x_3$ and $x_5$ have failed,
then:

$$
x_0 = x_4 + a\,x_1 + a^2 x_2, \quad\text{that is,}\quad
x_0 = \begin{bmatrix} 1 & a & a^2 \end{bmatrix} X_{4,1,2} = B\,X_{4,1,2}.
$$
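By way of illustration only, the distance property can be checked exhaustively. The following Python sketch is not part of the described implementation. It assumes the check equations $x_3 = x_0 + x_1 + x_2$, $x_4 = x_0 + a x_1 + a^2 x_2$ and $x_5 = x_0 + a^2 x_1 + a x_2$, encodes the elements of GF(4) as the integers 0, 1, 2 (for $a$) and 3 (for $a^2$), and uses hypothetical helper names.

```python
from itertools import combinations, product

# GF(4) elements encoded as 0, 1, 2 (= a), 3 (= a^2); addition is bitwise XOR.
# Multiplication table derived from 1 + a + a^2 = 0 (so a * a = a^2, a^3 = 1).
MUL = [
    [0, 0, 0, 0],
    [0, 1, 2, 3],
    [0, 2, 3, 1],
    [0, 3, 1, 2],
]

# Assumed check matrix A: rows give x3, x4, x5 in terms of x0, x1, x2.
A = [
    [1, 1, 1],
    [1, 2, 3],
    [1, 3, 2],
]

def encode(info):
    """Return the six-symbol codeword (x0..x5) for three GF(4) information symbols."""
    checks = []
    for row in A:
        s = 0
        for coeff, x in zip(row, info):
            s ^= MUL[coeff][x]
        checks.append(s)
    return tuple(info) + tuple(checks)

def minimum_distance():
    """Smallest Hamming weight of a non-zero codeword (valid because the code is linear)."""
    return min(
        sum(1 for s in encode(info) if s != 0)
        for info in product(range(4), repeat=3)
        if any(info)
    )

def erasures_recoverable(num_erased):
    """True if every pattern of num_erased erasures still identifies the codeword uniquely."""
    codewords = [encode(info) for info in product(range(4), repeat=3)]
    for erased in combinations(range(6), num_erased):
        kept = [i for i in range(6) if i not in erased]
        seen = {tuple(cw[i] for i in kept) for cw in codewords}
        if len(seen) != len(codewords):
            return False
    return True
```

Under these assumptions, `minimum_distance()` evaluates the distance directly, and `erasures_recoverable(3)` tests all twenty patterns of three erased symbols.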



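We now replace, in $X_{3,4,5}$ and in $X_{0,1,2}$, each $x_i$
($i = 0, 1, 2, 3, 4, 5$) by the pair of words $W_i = (w_{i,0}, w_{i,1})^t$;
in $A$ we replace 1 by the 2x2 matrix

$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},$$

$a$ by the 2x2 matrix

$$\begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix},$$

and $a^2$ by the 2x2 matrix

$$\begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix},$$

and $X_{3,4,5} = A\,X_{0,1,2}$ becomes

$$
\begin{bmatrix} w_{3,0} \\ w_{3,1} \\ w_{4,0} \\ w_{4,1} \\ w_{5,0} \\ w_{5,1} \end{bmatrix} =
\begin{bmatrix}
1 & 0 & 1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 & 1 \\
1 & 0 & 0 & 1 & 1 & 1 \\
0 & 1 & 1 & 1 & 1 & 0 \\
1 & 0 & 1 & 1 & 0 & 1 \\
0 & 1 & 1 & 0 & 1 & 1
\end{bmatrix}
\begin{bmatrix} w_{0,0} \\ w_{0,1} \\ w_{1,0} \\ w_{1,1} \\ w_{2,0} \\ w_{2,1} \end{bmatrix},
$$

that is, $W_{3,4,5} = r(A)\,W_{0,1,2}$.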



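This code, which we call the (3,3) code, has the property that even if all the
information in any three of the words is erased, the data can be recovered.
There are many ways to compute $W_{3,4,5} = r(A)\,W_{0,1,2}$. One of them (the
preferred one, since it uses the smallest number of XORs) is as follows (note
here the symbol $\oplus$ stands for XOR of two words):

$$
\begin{aligned}
V_1 &= w_{1,0} \oplus w_{2,0} \\
V_2 &= w_{1,1} \oplus w_{2,1} \\
V_3 &= w_{2,0} \oplus w_{0,0} \\
V_4 &= w_{2,1} \oplus w_{0,1} \\
w_{3,0} &= w_{0,0} \oplus V_1 \\
w_{3,1} &= w_{0,1} \oplus V_2 \\
w_{5,1} &= V_4 \oplus V_1 \\
w_{4,0} &= V_3 \oplus V_2 \\
w_{5,0} &= w_{4,0} \oplus V_1 \\
w_{4,1} &= V_2 \oplus w_{5,1}
\end{aligned}
$$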

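To recover errors, we use the same procedure. For example, if we
know that $W_0$, $W_3$ and $W_5$ have failed and we want to recover $W_0$, we
substitute the same 2x2 matrices for the entries of
$B = \begin{bmatrix} 1 & a & a^2 \end{bmatrix}$ and $W_4$, $W_1$, $W_2$ for
$x_4$, $x_1$, $x_2$ in $X_{4,1,2}$, and $x_0 = B\,X_{4,1,2}$ becomes:

$$
\begin{bmatrix} w_{0,0} \\ w_{0,1} \end{bmatrix} =
\begin{bmatrix}
1 & 0 & 0 & 1 & 1 & 1 \\
0 & 1 & 1 & 1 & 1 & 0
\end{bmatrix}
\begin{bmatrix} w_{4,0} \\ w_{4,1} \\ w_{1,0} \\ w_{1,1} \\ w_{2,0} \\ w_{2,1} \end{bmatrix},
$$

and $W_0$ can be recovered by performing five XORs (the term
$w_{1,1} \oplus w_{2,0}$ appears in both rows and needs to be computed only
once).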
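The five-XOR recovery of $W_0$ from $W_4$, $W_1$ and $W_2$ can be sketched as follows. This Python fragment is illustrative only; the function names are hypothetical, and `check_w4` is a reference definition assumed here from $x_4 = x_0 + a x_1 + a^2 x_2$ and the 2x2 substitutions.

```python
def check_w4(w0, w1, w2):
    """Assumed reference definition of the check pair W4 = W0 + r(a) W1 + r(a^2) W2."""
    (w00, w01), (w10, w11), (w20, w21) = w0, w1, w2
    return (w00 ^ w11 ^ w20 ^ w21, w01 ^ w10 ^ w11 ^ w20)

def recover_w0(w4, w1, w2):
    """Recover the pair W0 from W4, W1 and W2 using five XORs."""
    (w40, w41), (w10, w11), (w20, w21) = w4, w1, w2
    v1 = w11 ^ w20          # term shared by both output words
    v2 = w40 ^ w21
    v3 = w41 ^ w10
    return (v1 ^ v2, v1 ^ v3)
```

A round trip such as `recover_w0(check_w4(w0, w1, w2), w1, w2)` returns `w0` for any word values, since every other term cancels under XOR.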

In general, when we have a linear code $C$ over $GF(2^n)$, the Galois Field
of $2^n$ elements for $n$ greater than one, which can correct up to $e$ erasure
errors, we convert it to a code $C'$ which can correct up to $e$ erasures in
words, and whose encoding and correcting can be performed by XORing words as
follows:

1. The encoding for code $C$ is of the form $X_C = A\,X_I$, where $X_I$ is the
vector of information symbols and $X_C$ is the vector of check symbols, and
each of the corrections is also of the form $X_j = B_j\,X$, where $A$ and the
$B_j$s are matrices over $GF(2^n)$.

2. Choose a representation, $r$, of $GF(2^n)$. The representation assigns an
$n \times n$ matrix, $r(a)$, for every element $a$ in $GF(2^n)$, whose elements
are in GF(2); i.e., are "0" or "1".

3. To obtain the encoder of code $C'$, substitute the matrix $r(a)$ for every
element $a$ of $A$ to obtain the matrix $r(A)$, and substitute $W_i$ for $x_i$
in $X_I$ and in $X_C$, where $W_i = (w_{i,0}, w_{i,1}, \ldots, w_{i,n-1})^t$,
to obtain $W_I$ and $W_C$. The encoder of code $C'$ is $W_C = r(A)\,W_I$.

4. Similarly, by substituting $r(a)$ for every element $a$ of $B_j$ to obtain
$r(B_j)$, and substituting $W_i$ for every element $x_i$ of $X$ to obtain $W$,
we can recover $W_j$ by using $W_j = r(B_j)\,W$.

Since the entries of matrix $r(A)$ and of the $r(B_j)$s are elements of GF(2),
i.e., "0" or "1", it is clear that both the encoding and recovery can be done
by XORs of the words $w_{i,j}$. It is also clear that code $C'$, by its
construction, can recover the loss of up to $e$ of the $W_i$s.
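Steps 2 to 4 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the described implementation: field elements are encoded as integers whose bits are polynomial coefficients, the representation $r(a)$ is taken to be the matrix of multiplication by $a$ in the polynomial basis (one possible choice of representation), `poly` is an irreducible polynomial supplied by the caller, and all function names are hypothetical.

```python
def gf_mul(a, b, n, poly):
    """Multiply a and b in GF(2^n); poly is the irreducible polynomial including the x^n term."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> n:          # degree reached n: reduce modulo poly
            a ^= poly
    return result

def rep(a, n, poly):
    """n x n binary matrix r(a): column j holds the coordinates of a * x^j."""
    cols = [gf_mul(a, 1 << j, n, poly) for j in range(n)]
    return [[(cols[j] >> i) & 1 for j in range(n)] for i in range(n)]

def expand(matrix, n, poly):
    """Replace every entry a of a matrix over GF(2^n) by the n x n block r(a)."""
    out = []
    for row in matrix:
        blocks = [rep(a, n, poly) for a in row]
        for i in range(n):
            out.append([bit for blk in blocks for bit in blk[i]])
    return out

def xor_apply(binary_matrix, words):
    """Multiply a 0/1 matrix by a vector of words using only XOR."""
    out = []
    for row in binary_matrix:
        acc = 0
        for bit, word in zip(row, words):
            if bit:
                acc ^= word
        out.append(acc)
    return out
```

For GF(4) with `poly = 0b111` and $a$ encoded as 2, `rep(2, 2, 0b111)` and `rep(3, 2, 0b111)` give the 2x2 matrices used for $a$ and $a^2$ in the (3,3) code, and `xor_apply(expand(A, 2, 0b111), words)` performs the encoding with XORs only.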

Figure 3 is a flow diagram illustrating the logic of the implementation
of the encoder 14 in Figure 1. The input to the process is $W_0$, $W_1$, $W_2$,
blocks of $2n$ words, and the output is $W_3$, $W_4$, $W_5$, check blocks. The
process begins by initializing the index, $i$, to zero in function block 31. A
processing loop is entered at decision block 32 where a determination is made
as to whether $i$ is greater than or equal to $2n$. If so, the process exits;
otherwise, the following exclusive OR operations are performed in function
block 33:

$$
\begin{aligned}
V_1 &= W_1(i) \oplus W_2(i) \\
V_2 &= W_1(i+1) \oplus W_2(i+1) \\
V_3 &= W_2(i) \oplus W_0(i) \\
V_4 &= W_2(i+1) \oplus W_0(i+1)
\end{aligned}
$$

This is followed by performing the following exclusive OR operations in
function block 34:

$$
\begin{aligned}
W_3(i) &= W_0(i) \oplus V_1 \\
W_3(i+1) &= W_0(i+1) \oplus V_2 \\
W_5(i+1) &= V_4 \oplus V_1 \\
W_4(i) &= V_3 \oplus V_2 \\
W_5(i) &= W_4(i) \oplus V_1 \\
W_4(i+1) &= V_2 \oplus W_5(i+1)
\end{aligned}
$$

The index, $i$, is incremented by two in function block 35, and the process
loops back to decision block 32. This processing loop continues until $i$ is
determined to be greater than or equal to $2n$.
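The processing loop of Figure 3 can be sketched in Python as follows. The sketch is illustrative only: the function name is hypothetical, blocks are represented as Python lists of integers, and the operations assumed for $V_3$ and $W_5(i)$ are $V_3 = W_2(i) \oplus W_0(i)$ and $W_5(i) = W_4(i) \oplus V_1$.

```python
def encode_blocks(W0, W1, W2):
    """Compute check blocks W3, W4, W5 from data blocks of equal, even length."""
    length = len(W0)
    if length % 2 or len(W1) != length or len(W2) != length:
        raise ValueError("data blocks must share the same even length")
    W3, W4, W5 = [0] * length, [0] * length, [0] * length
    i = 0                                  # function block 31
    while i < length:                      # decision block 32
        # function block 33: four intermediate XORs
        v1 = W1[i] ^ W2[i]
        v2 = W1[i + 1] ^ W2[i + 1]
        v3 = W2[i] ^ W0[i]
        v4 = W2[i + 1] ^ W0[i + 1]
        # function block 34: six XORs producing the check words
        W3[i] = W0[i] ^ v1
        W3[i + 1] = W0[i + 1] ^ v2
        W5[i + 1] = v4 ^ v1
        W4[i] = v3 ^ v2
        W5[i] = W4[i] ^ v1
        W4[i + 1] = v2 ^ W5[i + 1]
        i += 2                             # function block 35
    return W3, W4, W5
```

Each pass through the loop reads six data words, performs ten XORs and writes six check words, and contains no data-dependent branch.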

Figure 4 is a flow diagram illustrating the logic of the implementation
of the decoder 20 in Figure 2 where disks 12₂ and 12₅ have failed, for
example. Corrections of other failure patterns of at most three failed disks can
be performed in a similar manner for this example. The input to the process is
$W_4$, $W_1$, $W_2$, blocks of $2n$ words, and the output is the $W_0$ data
block. The process begins by initializing the index, $i$, to 0 in function
block 41. A processing loop is entered at decision block 42 where a
determination is made as to whether $i$ is greater than or equal to $2n$. If
so, the process exits; otherwise, the following exclusive OR operations are
performed in function block 43:

$$
\begin{aligned}
V_1 &= W_1(i+1) \oplus W_2(i) \\
V_2 &= W_4(i) \oplus W_2(i+1) \\
V_3 &= W_4(i+1) \oplus W_1(i)
\end{aligned}
$$

This is followed by performing the following exclusive OR operations in
function block 44:

$$
\begin{aligned}
W_0(i) &= V_1 \oplus V_2 \\
W_0(i+1) &= V_1 \oplus V_3
\end{aligned}
$$

The index, $i$, is incremented by two in function block 45, and the process
loops back to decision block 42. This processing loop continues until $i$ is
determined to be greater than or equal to $2n$.
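The processing loop of Figure 4 can be sketched in the same way. This Python fragment is illustrative only: the function name is hypothetical, and it assumes that the block being rebuilt is $W_0$, that the inputs are $W_4$, $W_1$ and $W_2$, and that the second output operation is $W_0(i+1) = V_1 \oplus V_3$.

```python
def recover_w0_block(W4, W1, W2):
    """Rebuild data block W0 from check block W4 and data blocks W1, W2 (equal, even length)."""
    length = len(W1)
    if length % 2 or len(W2) != length or len(W4) != length:
        raise ValueError("blocks must share the same even length")
    W0 = [0] * length
    i = 0                                  # function block 41
    while i < length:                      # decision block 42
        # function block 43: three intermediate XORs
        v1 = W1[i + 1] ^ W2[i]
        v2 = W4[i] ^ W2[i + 1]
        v3 = W4[i + 1] ^ W1[i]
        # function block 44: two XORs producing the recovered words
        W0[i] = v1 ^ v2
        W0[i + 1] = v1 ^ v3
        i += 2                             # function block 45
    return W0
```

Each pass uses five XORs to rebuild two words, which matches the count given for recovering $W_0$ in the (3,3) code.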

The method according to the invention has been implemented in
software and compared with other known solutions. The software was written
in the C programming language, and the performance measured on an IBM
ThinkPad® laptop computer with a 1.7GHz Intel Pentium® 4 (P4) processor.
The P4 processor has a two level cache and pipelined architecture. We
therefore examined the performance of the codes as a function of the buffer
size for each hard disk drive (HDD). Small buffer sizes would show the
performance of the 512KB L2 cache, while larger sizes would show the
performance of the underlying memory system. The following codes were
tested: (1) a generic Reed-Solomon (RS) code, implemented using a
Vandermonde matrix and a lookup table to perform multiplication over
GF(256) and optimized for the (3+3) configuration; (2) the EVENODD (3+3)
code; and (3) the code according to the present invention and described above.

The results clearly showed the advantages of the XOR codes according
to the invention over conventional RS codes. Further, the higher efficiency of
the present invention was apparent when the performance was limited by the
processor (small buffer sizes). The code according to the invention used six
loads, ten XORs and six stores to process six data words. Thus, there were two
memory operations and 10/6 ≈ 1.67 XORs per word. Note that the word length
can be arbitrary. In contrast, the EVENODD (3+3) code used 28 loads, 30
XORs and six stores to process 12 data words. Thus, there were 30/12 = 2.5
XORs per word. Again, the word length can be arbitrary.

While the invention has been described in terms of a single preferred 
embodiment, those skilled in the art will recognize that the invention can be 
practiced with modification within the spirit and scope of the appended 
claims. 